135 research outputs found

    A preliminary evaluation of text-based and dependency-based techniques for determining the origins of bugs

    A crucial step in understanding the life cycle of software bugs is identifying their origin. Unfortunately, this information is not usually recorded, and recovering it at a later date is challenging. Recently, two approaches have been developed that attempt to solve this problem: the text approach and the dependency approach. However, only limited evaluation of their effectiveness has been carried out so far, partially due to the lack of data sets linking bugs to their introduction. Producing such data sets is both time-consuming and challenging due to the subjective nature of the problem. To improve this, the origins of 166 bugs in two open-source projects were manually identified. These were then compared to a simulation of the approaches. The results show that both approaches were partially successful across a variety of different types of bugs. They achieved a precision of 29%–79% and a recall of 40%–70%, and could perform better when combined. However, there remain a number of challenges to overcome in future development: large commits, unrelated changes and large numbers of versions between the origin and the fix all reduce their effectiveness.
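
As a rough illustration of how precision and recall could be computed when comparing an approach's reported origins with manually identified ones, here is a minimal sketch. The function name and the commit identifiers are hypothetical and do not come from the paper; the paper's own matching criteria may differ.

```python
def precision_recall(reported, actual):
    """Return (precision, recall) for two collections of commit identifiers."""
    reported, actual = set(reported), set(actual)
    true_positives = len(reported & actual)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical data: commits a reviewer marked as the bug's origin,
# and commits that an automated approach reported.
manual_origins = {"c101", "c205"}
reported_origins = {"c101", "c150", "c205", "c300"}

precision, recall = precision_recall(reported_origins, manual_origins)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the approach finds both true origins but also reports two unrelated commits, so recall is high while precision is halved.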

    A comparative evaluation of dynamic visualisation tools

    Despite their potential applications in software comprehension, it appears that dynamic visualisation tools are seldom used outside the research laboratory. This paper presents an empirical evaluation of five dynamic visualisation tools: AVID, Jinsight, jRMTool, Together ControlCenter diagrams and Together ControlCenter debugger. The tools were evaluated on a number of general software comprehension and specific reverse engineering tasks using the HotDraw object-oriented framework. The tasks considered typical comprehension issues, including identification of software structure and behaviour, design pattern extraction, extensibility potential, maintenance issues, functionality location, and runtime load. The results revealed that the level of abstraction employed by a tool affects its success in different tasks, and that tools were more successful in addressing specific reverse engineering tasks than general software comprehension activities. It was found that no one tool performs well in all tasks, and some tasks were beyond the capabilities of all five tools. This paper concludes with suggestions for improving the efficacy of such tools.

    Automatically classifying test results by semi-supervised learning

    A key component of software testing is deciding whether a test case has passed or failed: an expensive and error-prone manual activity. We present an approach to automatically classify passing and failing executions using semi-supervised learning on dynamic execution data (test inputs/outputs and execution traces). A small proportion of the test data is labelled as passing or failing and used in conjunction with the unlabelled data to build a classifier which labels the remaining outputs (classifying them as passing or failing tests). A range of learning algorithms are investigated using several faulty versions of three systems along with varying types of data (inputs/outputs alone, or in combination with execution traces) and different labelling strategies (both failing and passing tests, and passing tests alone). The results show that in many cases labelling just a small proportion of the test cases – as low as 10% – is sufficient to build a classifier that is able to correctly categorise the large majority of the remaining test cases. This has important practical potential: when checking the test results from a system, a developer need only examine a small proportion of these and use this information to train a learning algorithm to automatically classify the remainder.
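
The general idea of using a few labelled test results together with unlabelled ones can be sketched with self-training over a nearest-centroid classifier. This is only one simple semi-supervised scheme, chosen here for brevity; the abstract does not name the algorithms the paper investigates, and the feature vectors, labels and function names below are hypothetical.

```python
from statistics import mean


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(mean(axis) for axis in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_training_classify(labelled, unlabelled):
    """Label every unlabelled vector, most confident first.

    labelled: list of (vector, label) pairs; unlabelled: list of vectors.
    Each round recomputes class centroids, labels the single pending
    vector that lies closest to any centroid, and adds it to the
    training set before the next round.
    """
    labelled = list(labelled)
    pending = dict(enumerate(unlabelled))
    predictions = {}
    while pending:
        groups = {}
        for vector, label in labelled:
            groups.setdefault(label, []).append(vector)
        centroids = {label: centroid(vs) for label, vs in groups.items()}
        index, label, _ = min(
            ((i, lab, sq_dist(vec, cen))
             for i, vec in pending.items()
             for lab, cen in centroids.items()),
            key=lambda candidate: candidate[2],
        )
        predictions[index] = label
        labelled.append((pending.pop(index), label))
    return [predictions[i] for i in range(len(unlabelled))]


# Hypothetical numeric features extracted from test executions.
seed = [((0.0, 0.0), "pass"), ((10.0, 10.0), "fail")]
remaining = [(1.0, 0.5), (9.0, 9.5), (2.0, 2.0), (8.0, 8.0)]
labels = self_training_classify(seed, remaining)
print(labels)
```

Here two labelled executions are enough to classify the other four, which mirrors the claim that only a small labelled proportion may be needed; real execution data would require a feature encoding that this sketch does not attempt.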

    Building test oracles by clustering failures

    In recent years, software testing research has produced notable advances in the area of automated test data generation, but the corresponding oracle problem (a mechanism for determining the (in)correctness of an executed test case) is still a major problem. In this paper, we present a preliminary study which investigates the application of anomaly detection techniques (based on clustering) to automatically build an oracle using a system’s input/output pairs, based on the hypothesis that failures will tend to group into small clusters. The fault detection capability of the approach is evaluated on two systems, and the findings reveal that failing outputs do indeed tend to congregate in small clusters, suggesting that the approach is feasible and has the potential to reduce by an order of magnitude the number of outputs that would need to be manually examined following a test run.
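
The hypothesis that failures gather in small clusters can be illustrated with a minimal sketch: cluster the encoded input/output pairs, then flag only the members of small clusters for manual inspection. The abstract does not say which clustering algorithm was used, so a simple threshold-based "leader" clustering stands in here; the data points, `radius` and `max_size` are assumed values for illustration.

```python
from math import dist


def leader_cluster(points, radius):
    """Greedy clustering: a point joins the first cluster whose leader
    lies within `radius`, otherwise it starts a new cluster."""
    clusters = []  # list of (leader, members) pairs
    for point in points:
        for leader, members in clusters:
            if dist(point, leader) <= radius:
                members.append(point)
                break
        else:
            clusters.append((point, [point]))
    return [members for _, members in clusters]


def suspicious_outputs(points, radius, max_size):
    """Return the points that belong to clusters of at most `max_size`."""
    return [
        point
        for members in leader_cluster(points, radius)
        if len(members) <= max_size
        for point in members
    ]


# Hypothetical numeric encodings of input/output pairs from one test run.
outputs = [
    (1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.0, 1.1),  # large, dense group
    (5.0, 5.0), (5.1, 4.9),                          # small group
    (9.0, 0.0),                                      # isolated point
]
to_inspect = suspicious_outputs(outputs, radius=0.5, max_size=2)
print(to_inspect)
```

Only three of the seven outputs are flagged, which is the kind of reduction in manual checking the abstract describes; how well this works in practice depends on the encoding and thresholds chosen.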

    A collaborative approach to learning programming: a hybrid learning model

    The use of cooperative working as a means of developing collaborative skills has been recognised as vital in programming education. This paper presents results obtained from preliminary work to investigate the effectiveness of Pair Programming as a collaborative learning strategy and also its value towards improving programming skills within the laboratory. The potential of Problem Based Learning as a means of further developing cooperative working skills along with problem solving skills is also examined, and a hybrid model encompassing both strategies is outlined.

    Comparing text-based and dependence-based approaches for determining the origins of bugs

    Identifying bug origins – the point where erroneous code was introduced – is crucial for many software engineering activities, from identifying process weaknesses to gathering data to support bug detection tools. Unfortunately, this information is not usually recorded when fixing bugs, and recovering it later is challenging. Recently, the text approach and the dependence approach have been developed to tackle this problem. Respectively, they examine textual and dependence-related changes that occurred prior to a bug fix. However, only limited evaluation has been carried out, partially because of a lack of available implementations and of datasets linking bugs to origins. To address this, origins of 174 bugs in three projects were manually identified and compared to a simulation of the approaches. Both approaches were partially successful across a variety of bugs, achieving 29–79% precision and 40–70% recall. Results suggested the precise definition of program dependence could affect performance, as could whether the approaches identified a single or multiple origins. Some potential improvements are explored in detail, identifying pragmatic strategies for combining techniques along with simple modifications. Even after adopting these improvements, there remain many challenges: large commits, unrelated changes and long periods between origins and fixes all reduce effectiveness.

    Machine learning techniques for automated software fault detection via dynamic execution data : empirical evaluation study

    The biggest obstacle to automated software testing is the construction of test oracles. Today, it is possible to generate an enormous number of test cases for an arbitrary system that reach a remarkably high level of coverage, but the effectiveness of test cases is limited by the availability of test oracles that can distinguish failing executions. Previous work by the authors has explored the use of unsupervised and semi-supervised learning techniques to develop test oracles so that the correctness of software outputs and behaviours on new test cases can be predicted [1], [2], [10], and experimental results demonstrate the promise of this approach. In this paper, we present an evaluation study for test oracles based on machine-learning approaches via dynamic execution data (firstly, input/output pairs and secondly, amalgamations of input/output pairs and execution traces) by comparing their effectiveness with existing techniques from the specification mining domain (the data invariant detector Daikon [5]). The two approaches are evaluated on a range of mid-sized systems and compared in terms of their fault detection ability and false positive rate. The empirical study also discusses the major limitations and the most important properties related to the application of machine learning techniques as test oracles in practice. The study also gives a road map for further research directions in order to tackle some of the discussed limitations, such as accuracy and scalability. The results show that in most cases semi-supervised learning techniques performed far better as an automated test classifier than Daikon (especially when input/output pairs were augmented with their execution traces). However, there is one system for which our strategy struggles and Daikon performed far better. Furthermore, unsupervised learning techniques performed on a par with Daikon in several cases, particularly when input/output pairs were used together with execution traces.

    The impact of ensemble techniques on software maintenance change prediction : an empirical study

    Various prediction models have been proposed by researchers to predict the change-proneness of classes based on source code metrics. However, some of these models suffer from low prediction accuracy because datasets exhibit high dimensionality or imbalanced classes. Recent studies suggest that using ensembles to integrate several models, select features, or perform sampling has the potential to resolve issues in the datasets and improve the prediction accuracy. This study aims to empirically evaluate the effectiveness of the ensemble models, feature selection, and sampling techniques on predicting change-proneness using different metrics. We conduct an empirical study to compare the performance of four machine learning models (naive Bayes, support vector machines, k-nearest neighbors, and random forests) on seven datasets for predicting change-proneness. We use two types of feature selection (relief and Pearson’s correlation coefficient) and three types of ensemble sampling techniques, which integrate different types of sampling techniques (SMOTE, spread sub-sample, and randomize). The results of this study reveal that the ensemble feature selection and sampling techniques yield improved prediction accuracy over most of the investigated models, and using sampling techniques increased the prediction accuracy of all models. Random forests provide a significant improvement over other prediction models and obtained the highest average area under the curve in all scenarios. The proposed ensemble feature selection and sampling techniques, along with the ensemble model (random forests), were found beneficial in improving the prediction accuracy of change-proneness.

    Using smartphones in cities to crowdsource dangerous road sections and give effective in-car warnings

    The widespread day-to-day carrying of powerful smartphones gives opportunities for crowdsourcing information about the users' activities to gain insight into patterns of use of a large population in cities. Here we report the design and initial investigations into a crowdsourcing approach for sudden decelerations to identify dangerous road sections. Sudden brakes and near misses are much more common than police-reportable accidents but are under-exploited, and they have the potential for a more responsive reaction than waiting for accidents. We also discuss different multimodal feedback conditions to warn drivers approaching a dangerous zone. We believe this crowdsourcing approach gives cost and coverage benefits over infrastructural smart-city approaches but that users need to be incentivized to use it.

    Development and validation of a digital biomarker predicting acute kidney injury following cardiac surgery on an hourly basis

    Objectives: To develop and validate a digital biomarker for predicting the onset of acute kidney injury (AKI) on an hourly basis up to 24 hours in advance in the intensive care unit after cardiac surgery. Methods: The study analyzed data from 6056 adult patients undergoing coronary artery bypass graft (CABG) and/or valve surgery between 1st April 2012 and 31st December 2018 (development phase, training, and testing) and 3572 patients between 1st January 2019 and 30th June 2022 (validation phase). The study utilized two dynamic predictive modeling approaches, namely logistic regression (LR) and bootstrap aggregated regression trees machine (BARTm), to predict AKI. The mean area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive and negative predictive values (PPV and NPV) across all lead times before the occurrence of AKI were reported. The clinical practicality was assessed using calibration. Results: Of all included patients, 8.45% and 16.66% had AKI in the development and validation phases, respectively. When applied to testing data, AKI was predicted with a mean AUC of 0.850 and 0.802 by BARTm and LR, respectively. When applied to validation data, BARTm and LR resulted in a mean AUC of 0.844 and 0.786, respectively. Conclusions: This study demonstrated the successful prediction of AKI on an hourly basis up to 24 hours in advance. The digital biomarkers developed and validated in this study have the potential to assist clinicians in optimizing treatment and implementing preventive strategies for patients at risk of developing AKI after cardiac surgery in the ICU.